ABSTRACT

We present new computational methodology for designing polymers, such as
polypeptides and polyelectrolytes, which can selectively recognize nanostructured substrates.

The methodology applies to polymers which might be used to control placement and assembly
for electronic devices, to template structure during materials synthesis, and to add new
biological and chemical functionality to surfaces. Optimization of the polymer configurational
sequence permits enhancement of both binding energy on and binding selectivity between one or
more atomistic surfaces. A novel Continuous Rotational Isomeric State (CRIS) method permits
continuous backbone torsion sampling and is seen to be critical in binding optimization problems
where chain flexibility is important. We illustrate selective polypeptide binding between either
analytic, uniformly charged surfaces or atomistic GaAs(100), GaAs(110) and GaAs(111)
surfaces. Computational results compare very favorably with prior experimental phage display
observations [S. R. Whaley et al., Nature 405, 665 (2000)] for GaAs substrates. Further
investigation indicates that chain flexibility is important for selective binding between
surfaces of similar charge density. Such chains begin with sequences which repel the surfaces,
continue with sequences that attract the surface and end with sequences that neither attract nor
repel strongly.


INTRODUCTION 

We present new computational methodology for designing polymers, such as
polypeptides and polyelectrolytes, which can selectively recognize nanostructured substrates.

The methodology applies to polymers which might be used to control placement and assembly
for electronic devices, to template structure during materials synthesis, and to add new
biological and chemical functionality to surfaces. Optimization of the polymer configurational
sequence permits enhancement of both binding energy on and binding selectivity between one or
more atomistic surfaces. This optimization is enabled by combining highly efficient, atomistic
modeling of the polymer and surfaces with genetic mutation of the polymer configuration. The
atomistic modeling permits the calculation of macromolecular statistics and thermodynamics of
substrate binding, while genetic sequence mutation enables the search and enhancement of the
desired polymer-surface interactions.
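
The mutate-and-select idea can be sketched in a few lines. This is a minimal illustration and not the procedure used in this work: it is a single-parent mutation loop, simpler than a full genetic algorithm, and the reduced residue alphabet, the scoring function `toy_score` and all parameter values are assumptions standing in for the atomistic binding calculation.

```python
import random

RESIDUES = "KDGP"  # hypothetical reduced alphabet: base, acid, flexible, stiff


def toy_score(seq):
    """Placeholder fitness (assumption): reward acidic residues near the chain middle."""
    mid = (len(seq) - 1) / 2
    return sum(1.0 / (1.0 + abs(i - mid)) for i, r in enumerate(seq) if r == "D")


def mutate(seq, rng):
    """Return a copy of seq with one randomly chosen residue replaced."""
    i = rng.randrange(len(seq))
    return seq[:i] + rng.choice(RESIDUES) + seq[i + 1:]


def optimize(seq, score, generations=200, seed=0):
    """Keep a mutation only when it does not lower the score."""
    rng = random.Random(seed)
    best, best_score = seq, score(seq)
    for _ in range(generations):
        trial = mutate(best, rng)
        trial_score = score(trial)
        if trial_score >= best_score:
            best, best_score = trial, trial_score
    return best, best_score


start = "GGGGGGGG"
best, best_score = optimize(start, toy_score)
print(start, "->", best, round(best_score, 3))
```

In a real application the score would be a computed binding free energy or a selectivity between two surfaces, so each evaluation would be far more expensive than this toy function.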

Previous experimental works have demonstrated polypeptides with selectivity for binding
to surfaces of metals and metal oxides [1-8] as well as a range of semiconductor surfaces [9].
Polypeptides which can recognize desired surfaces are typically selected from a library of several
million candidates using either bare proteins or phages, often in the presence of surfactants or
salts. These methods are often both practical and useful, but several issues remain. First, it is
not always clear whether the selected polypeptides will retain their binding and selectivity once
removed from the parent protein or phage body. Second, practical experimental libraries of even
1 o’* candidates might not well represent the complete range of functionalities present in the > 1 0 '
possibilities from natural residues. Third, experimental screening does not typically teach why
particular consensus sequences emerge. Hence we might not always be able to predict new and
better binding sequences. Finally, we ask if it is possible to design better polymer sequences and
compositions than those available from natural sources and residues.

If theoretical and computational methods are to be as practical and useful, they will
surely need to contain the salient physics and chemistries of polymers and surfaces while
remaining both accurate and quickly solvable. Toward this end, we illustrate methodology for
selective polypeptide binding between either analytic, uniformly charged surfaces or atomistic
model surfaces. Here we compare our preliminary findings to recent pioneering experimental
results [9]. Further, we ask how to find optimal sequences which select a target surface over
closely similar surfaces.

COMPUTATIONAL  METHODS 

Polypeptides are described as rotational isomeric state chains in which bond lengths and
bond angles are frozen at equilibrium values while torsional rotations remain degrees of freedom.
In applying discrete Rotational Isomeric State (RIS) theory [see e.g. 10-12], we select discrete
energy states at the minima in a potential energy surface obtained by mapping pairwise-conditional
rotational angles around the neighboring N-Cα and Cα-C’ bonds for each natural amino acid residue.
In applying our new Continuous Rotational Isomeric State (CRIS) method, the torsional angles
may be selected within a continuous range from rectangular tiles around minima in the potential
energy surface. The tiles are defined from energy minima bounded by a preset, maximum well
height, typically 1 kcal/mol. If a local potential energy surface exhibits a relative maximum
before reaching the preset height, then the tile boundary is defined at the relative maximum. Tile
boundaries are combined by overall union if the tile definitions create overlapping regions from
multiple energy minima. A chain backbone conformation is completely defined by the fixed
bond lengths, fixed bond angles and selected torsions. Non-backbone atom positions are
described as “pendtors”, i.e. pendant vectors. Amide hydrogens and oxygens are placed on the
backbone from constant vectorial components using a basis set generated from the [C’-N] and
[N-Cα] bond vectors and their cross-product. Likewise, each residue’s pendant atoms are placed from
constant vectorial components using a basis set generated from the [N-Cα] and [Cα-C’] bond vectors
and their cross-product. The various rotational states within each residue’s pendant group can be
represented by sets of their atomic pendtors. A rotational potential energy surface and the
aforementioned fixed geometrical parameters for each amino acid residue with amide
terminations were created using the PCFF forcefield [13,14]. When implementing RIS, a table of
discrete rotational states (each state comprising the conditional pair of torsion angles and the
associated energy value) for each residue is stored in memory during a simulation to look up the
rotational energy. When implementing CRIS, the entire pairwise-conditional torsional energy
surface for each residue is stored in memory during a simulation to look up and
interpolate the rotational energy.
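
The tile construction can be illustrated with a small sketch. It is one possible reading of the rule stated above, not the implementation used in this work: the energy function `toy_energy`, the location of its minimum, the 5 degree scan step and the choice to scan only along the two axes through the minimum are all assumptions.

```python
import math
import random

PHI0, PSI0 = -60.0, -45.0  # hypothetical location of one energy minimum (degrees)


def toy_energy(phi, psi):
    """Synthetic torsional energy in kcal/mol (assumption; not a PCFF surface)."""
    d_phi = math.radians(phi - PHI0)
    d_psi = math.radians(psi - PSI0)
    return 3.0 * (1.0 - math.cos(d_phi)) + 3.0 * (1.0 - math.cos(d_psi))


def edge(energy_1d, start, step, height, max_steps=72):
    """Scan from the minimum along one axis until the energy would exceed the
    well height or starts to fall again (a relative maximum was passed)."""
    e_min = energy_1d(start)
    pos, prev = start, e_min
    for _ in range(max_steps):
        e = energy_1d(pos + step)
        if e - e_min > height or e < prev:
            break
        pos, prev = pos + step, e
    return pos


def build_tile(height=1.0, step=5.0):
    """Rectangular tile ((phi_lo, phi_hi), (psi_lo, psi_hi)) around the minimum."""
    along_phi = lambda phi: toy_energy(phi, PSI0)
    along_psi = lambda psi: toy_energy(PHI0, psi)
    phi_range = (edge(along_phi, PHI0, -step, height), edge(along_phi, PHI0, step, height))
    psi_range = (edge(along_psi, PSI0, -step, height), edge(along_psi, PSI0, step, height))
    return phi_range, psi_range


def sample_torsions(tile, rng):
    """Draw one (phi, psi) pair uniformly from the rectangular tile."""
    (phi_lo, phi_hi), (psi_lo, psi_hi) = tile
    return rng.uniform(phi_lo, phi_hi), rng.uniform(psi_lo, psi_hi)


tile = build_tile()
rng = random.Random(1)
samples = [sample_torsions(tile, rng) for _ in range(5)]
print(tile)
print(samples)
```

Because the tile is rectangular, its corners can lie somewhat above the preset well height even though each edge midpoint respects it; the sampled energy is then obtained by interpolating the stored surface at the drawn angles.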

The polypeptide potential energy includes additional interatomic contributions. Self-avoidance
is ensured by assigning a hard-sphere radius, typically 0.5 Å, to each atom. The
hydrophobic effect is approximated by contributing a fixed energy decrement, typically -0.25
kcal/mol, when two hydrophobic groups reach a minimum separation, typically 5 Å. Electrostatic
potential energy between atoms arises from partial charges for each atom assigned by the
COMPASS forcefield [15]. Since the polypeptide is ensconced in an effective solvent medium,
atoms experience diminished electrostatic potentials, V, through the Debye-Hückel potential.
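
For reference, a standard form of the Debye-Hückel pair potential that matches the symbol definitions given here is shown below. The Gaussian-unit convention and the normalization by kT are assumptions; in SI units the bracketed Bjerrum length carries an additional factor of 1/(4πε₀).

```latex
\frac{V_{ij}}{kT} = \left[\frac{e^{2}}{\varepsilon kT}\right] q_i q_j \,\frac{\exp(-\kappa r_{ij})}{r_{ij}} \tag{1}
```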


where the first enclosed term is the thermal Bjerrum length, q_i is the partial charge on the ith
atom, ε is the solvent dielectric constant, κ is the inverse electrostatic screening or Debye length,
which is dictated in actual experiments by the concentration of dissolved ions, and r_ij is the
distance between the ith and jth atoms. Equation (1) applies also between polymer and surface
atoms. For infinite analytic surfaces with uniform charge, we compute the potential between
atoms and the analytic surface using the integrated form of Equation (1) below:
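
Assuming Equation (1) has the standard screened-Coulomb form $V_{ij}/kT = l_B\, q_i q_j \exp(-\kappa r_{ij})/r_{ij}$, with $l_B$ the Bjerrum length, the integration over an infinite plane can be sketched in polar coordinates using the substitution $r^2 = \rho^2 + z_i^2$, so that $r\,dr = \rho\,d\rho$:

```latex
\frac{V_i}{kT}
  = l_B\, q_i\, \sigma \int_0^{\infty}
      \frac{\exp\!\left(-\kappa\sqrt{\rho^{2} + z_i^{2}}\right)}{\sqrt{\rho^{2} + z_i^{2}}}\,
      2\pi\rho\, d\rho
  = 2\pi\, l_B\, q_i\, \sigma \int_{z_i}^{\infty} \exp(-\kappa r)\, dr
  = \frac{2\pi\, l_B\, \sigma\, q_i}{\kappa}\, \exp(-\kappa z_i) \tag{2}
```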


where σ is the surface charge density and z_i is the height above the surface of the ith atom.
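
The relation between the pairwise and the integrated potential can be checked numerically. The sketch below assumes the standard screened-Coulomb form in units of kT; the Bjerrum length, the inverse Debye length and the test charge are illustrative values chosen for the example, not parameters taken from this work.

```python
import math

BJERRUM = 7.0  # Å; illustrative value for water near room temperature (assumption)
KAPPA = 0.1    # 1/Å; illustrative inverse Debye length (assumption)


def pair_energy(q_i, q_j, r):
    """Screened Coulomb energy between two point charges, in units of kT."""
    return BJERRUM * q_i * q_j * math.exp(-KAPPA * r) / r


def plane_energy(q_i, sigma, z):
    """Closed-form energy of a charge at height z above an infinite charged plane (kT)."""
    return 2.0 * math.pi * BJERRUM * sigma * q_i * math.exp(-KAPPA * z) / KAPPA


def plane_energy_numeric(q_i, sigma, z, r_max=300.0, n=30000):
    """Midpoint-rule sum of pair energies over concentric rings out to radius r_max."""
    d_rho = r_max / n
    total = 0.0
    for k in range(n):
        rho = (k + 0.5) * d_rho
        ring_charge = sigma * 2.0 * math.pi * rho * d_rho
        total += pair_energy(q_i, ring_charge, math.hypot(rho, z))
    return total


q, sigma, z = -0.5, 0.1, 3.0  # charge (e), surface charge density (e/Å²), height (Å)
exact = plane_energy(q, sigma, z)
numeric = plane_energy_numeric(q, sigma, z)
print(exact, numeric)
```

The closed form avoids summing over surface atoms, which is why analytic surfaces are cheap to evaluate compared with atomistic ones.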

Our simulations utilize an unusual methodology to sample the polymer degrees of
freedom (sometimes referred to as “simple sampling Monte Carlo” [16] or direct phase-space
integration) for computing molecular statistics and thermodynamics. The degrees of freedom
consist of: the internal torsion states along the polymer backbone, the position of an end bond
vector relative to the surface origin and a rigid-body rotation of the chain around the end bond
vector. Torsional states are selected randomly such that each discrete state (for RIS) or position
in a torsional tile state (for CRIS) is selectable with equal and uniform probability. Phage display
and tumble chain sampling methods are used to integrate over the remaining spatial degrees of
freedom. Phage display sampling always assigns the C’-terminus bond vector normal to the
surface (with the penultimate bond vector toward the surface) and chooses a random, rigid-body
rotation about the end bond vector. Note that the polypeptide C’-terminus is fixed so that the
N-terminus can be displayed to the surface to emulate a phage peptide display. Tumble sampling
comprises selecting a random spatial orientation of an end bond vector and selecting a random
rigid-body rotation about the end bond vector. For both sampling methods the absolute distance
between the lowest atom in the polymer and the highest component of the surface is varied to
sample a profile of important statistical quantities as a function of height above the surface. Since
all Monte Carlo trials are always “accepted”, statistical quantities are computed with each
configuration being weighted by its appropriate thermal Boltzmann factor, exp[-(E-E_form)/kT].
Here E is the total potential energy of a Monte Carlo trial and E_form is the energy of formation.
Statistical quantities of interest include the well known polymer-surface binding free energy, A,
internal energy, U, entropy, S, binding constant, K, as well as geometrical shape changes, e.g.
strain, ε, and squash,


[Equations (3a,b): definitions of the strain and the squash; the expressions are not recoverable from the source.]


Here we define a strain in terms of the mean-squared end-to-end chain distance on the surface,
<R²>_surface, relative to free solvent, <R²>_solvent. The chain squash is defined in terms of the
mean-squared chain end-to-end vector components parallel to the surface (designated by horizontal
arrows) and normal to the surface (designated by vertical arrows).
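
Because every trial is accepted, averages reduce to Boltzmann-weighted sums over uniformly drawn configurations. The following sketch shows one common way to evaluate such sums. The function names, the value of kT and the use of the lowest sampled energy as a reference (chosen here only for numerical stability) are assumptions for illustration.

```python
import math

KT = 0.593  # kcal/mol, roughly room temperature (assumption)


def boltzmann_average(energies, values, kT=KT):
    """Weighted mean of `values`, each weighted by exp(-(E - E_ref)/kT)."""
    e_ref = min(energies)  # shifting by a constant cancels in the ratio
    weights = [math.exp(-(e - e_ref) / kT) for e in energies]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def free_energy(energies, kT=KT):
    """-kT ln of the mean Boltzmann factor over uniformly drawn samples."""
    e_ref = min(energies)
    mean_factor = sum(math.exp(-(e - e_ref) / kT) for e in energies) / len(energies)
    return e_ref - kT * math.log(mean_factor)


def binding_free_energy(surface_energies, solvent_energies, kT=KT):
    """Free energy near the surface minus free energy in free solvent."""
    return free_energy(surface_energies, kT) - free_energy(solvent_energies, kT)


# Toy energies in kcal/mol (made up for the example)
surface = [-2.0, -1.0, 0.5, 3.0]
solvent = [0.0, 0.2, 0.1, 0.4]
print(binding_free_energy(surface, solvent))
```

A few low-energy configurations can dominate the weighted sums, so convergence is best judged from independent replicate runs rather than from a single sample count.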


RESULTS 


Figure 1 illustrates the typical convergence of a polypeptide-surface binding free energy
as a function of the number of Monte Carlo samples. This example polypeptide has over 10^
discrete torsional degrees of freedom, yet the free energy has less than 1.5 kcal/mol uncertainty
after 10' phage samples and 10^ tumble samples. Note that this particular chain is a very strong
binder. Metropolis Monte Carlo methods typically do not converge as quickly [17] and require
extensive equilibration to surmount chain conformation trapping in deep potential energy wells,
which is intrinsic to these systems [18].

Figure 1. Polypeptide-surface binding free energy as a function of Monte Carlo samples using
the CRIS chain model. The test polypeptide is presented to a flat surface with uniform charge
density 0.1 e/Å². Error bars indicate the standard deviation from 10 independent, replicated
simulations.

Our computational results compare favorably with previously published experimental
phage display observations. We subjected model RIS chains to phage sampling over an atomistic
GaAs(100) surface model. The residue sequences match the pIII coat proteins of M13 coliphages
reported in Figure 1 of reference [9], which were found to bind to GaAs(100) substrates. While
these preliminary computations predict that 8 of the 11 polypeptides have very favorable binding
free energies (ΔA < 0), all chains are predicted to bind significantly to the surface. The statistical
binding constant, K, takes a value of unity for permanent surface binding and zero for no surface
binding. Our computed values of K range from 0.4 to 0.9, which is consistent with elutable
surface binding. Typically, we find that those chains with the lowest binding free energies also
show the greatest tendency to spread across the surface (high values of strain and squash). Most
significantly, our computations do reproduce experimental observations that clone G1-3 exhibits
preferential binding to GaAs(100) over both gallium- and arsenic-terminated GaAs(111) faces.
TABLE I. Binding results for RIS chain models of phage display polypeptides (see Figure 1 of
reference [9]) tested by phage sampling over an atomistic GaAs(100) model.

Phage [9]    ΔA     ΔU    ΔS×10³    K      ε     squash
G1-3        -1.6   -1.9    -0.9    0.9    2.19    0.45
G1-4        -0.6    0.5     3.8    0.7    1.64   -1.80
G7-4         0.3   -0.6    -3.1    0.4   -0.19   -0.10
G11-3        0.2    1.0     2.6    0.4   -0.16   -0.17
G12-3       -0.9    1.3     7.3    0.8    0.26    0.94
G12-4       -0.5   -1.2    -5.6    0.7   -0.18    0.10
G12-5        0.3   -0.6    -2.8    0.4   -0.27   -0.06
G13-5       -1.0    1.1     7.0    0.8    0.58    1.66
G14-3       -1.0    0.8     6.0    0.9    1.62    0.93
G14-4       -0.4   -0.2     0.9    0.7    0.28    1.41
G15-5       -0.8    0.2     3.5    0.8    0.20   -0.14
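
The summary statements in the text can be checked directly against the tabulated values. The short script below uses the ΔA and K columns transcribed from Table I; the variable names are illustrative only.

```python
# (clone label, ΔA, K) transcribed from Table I
TABLE_I = [
    ("G1-3", -1.6, 0.9), ("G1-4", -0.6, 0.7), ("G7-4", 0.3, 0.4),
    ("G11-3", 0.2, 0.4), ("G12-3", -0.9, 0.8), ("G12-4", -0.5, 0.7),
    ("G12-5", 0.3, 0.4), ("G13-5", -1.0, 0.8), ("G14-3", -1.0, 0.9),
    ("G14-4", -0.4, 0.7), ("G15-5", -0.8, 0.8),
]

# Clones with a favorable (negative) binding free energy
favorable = [name for name, delta_a, _ in TABLE_I if delta_a < 0]
k_values = [k for _, _, k in TABLE_I]

print(len(favorable), "of", len(TABLE_I), "clones have ΔA < 0")
print("K ranges from", min(k_values), "to", max(k_values))
```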

In a separate set of computational experiments, we explore how to construct chains that
bind selectively to only modestly Lewis-acidic surfaces. Sequences were limited to the
residue-pair combinations: KP (stiff base), KG (flexible base), DP (stiff acid), DG (flexible acid), GG
(flexible neutral) and PP (stiff neutral). Furthermore, we compare fully atomistic chain models to
united-atom chain models (where each backbone entity possesses the net charge from all of its
pendant atoms while preserving RIS properties for each residue). Not surprisingly, we find that
chains with high base residue content bind strongly to surfaces with increasing surface charge
density, but this is only trivial selectivity. More interestingly, we found that only chains with
flexible residue pairs exhibit non-monotonic binding to surfaces with increasing surface charge
density. Table II illustrates sequences with optimized binding constants for intermediate-valued
surface charge densities. Such chains begin with base sequences which repel the surfaces, continue
with acid sequences that attract the surface and end with neutral sequences that neither attract nor
repel strongly. This suggests that several types of star polymers could exhibit interesting selective
binding properties.

TABLE II. Surface binding constant, K, for RIS united-atom chain models on analytic surfaces.

                     Surface Charge Density (e/Å²)
Sequence             0.20   0.16   0.12   0.08   0.04
(KP)2(DP)2(G)12      0.90   0.98   0.82   0.48   0.51
(KG)2(DG).(G)12      0.92   1.00   0.69   0.51   0.52
(KG)2(DP),(G)12      0.96   0.96   0.46   0.55   0.51
(KG)2(DG)2(P)12      0.98   0.74   0.86   0.63   0.56
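
Whether a binding profile is monotonic in surface charge density can be tested mechanically. The sketch below uses the K values transcribed from Table II, with rows identified by their order in the table; the helper names are illustrative and ties are resolved in favor of the first listed (highest) density.

```python
DENSITIES = [0.20, 0.16, 0.12, 0.08, 0.04]  # e/Å², in the column order of Table II

# K values per table row, transcribed from Table II
K_ROWS = [
    [0.90, 0.98, 0.82, 0.48, 0.51],
    [0.92, 1.00, 0.69, 0.51, 0.52],
    [0.96, 0.96, 0.46, 0.55, 0.51],
    [0.98, 0.74, 0.86, 0.63, 0.56],
]


def is_monotonic(values):
    """True if the sequence never increases or never decreases."""
    pairs = list(zip(values, values[1:]))
    return all(a <= b for a, b in pairs) or all(a >= b for a, b in pairs)


def peak_density(k_values):
    """Charge density at which K is largest (first listed density wins ties)."""
    return DENSITIES[k_values.index(max(k_values))]


for row_number, row in enumerate(K_ROWS, start=1):
    print(row_number, "monotonic:", is_monotonic(row), "peak at", peak_density(row))
```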